-
Notifications
You must be signed in to change notification settings - Fork 38.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MESOS: check for errors during resource allocation in the scheduler plugin #15976
MESOS: check for errors during resource allocation in the scheduler plugin #15976
Conversation
Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist") If this message is too spammy, please complain to ixdy. |
@jdef PTAL I tracked my changes with single atomic commits. Also please let me know if you have an idea on how to bump up code coverage with some test scenario of |
@@ -341,7 +348,9 @@ func (k *kubeScheduler) doSchedule(task *podtask.T, err error) (string, error) { | |||
} | |||
|
|||
task.Offer = offer | |||
k.api.algorithm().Procurement()(task, details) // TODO(jdef) why is nothing checking the error returned here? | |||
if err := k.api.algorithm().Procurement()(task, details); err != nil { | |||
return "", err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forgot to mention earlier, we probably want to offer.Release()
before returning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and task.Reset()
Labelling this PR as size/M |
82ddd59
to
c726d2e
Compare
@jdef PTAL added |
Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist") If this message is too spammy, please complain to ixdy. |
2 similar comments
Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist") If this message is too spammy, please complain to ixdy. |
Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist") If this message is too spammy, please complain to ixdy. |
@@ -278,7 +278,11 @@ func (k *kubeScheduler) Schedule(pod *api.Pod, unused algorithm.NodeLister) (str | |||
log.Infof("aborting Schedule, pod has been deleted %+v", pod) | |||
return "", noSuchPodErr | |||
} | |||
return k.doSchedule(k.api.tasks().Register(k.api.createPodTask(ctx, pod))) | |||
task, err := k.api.tasks().Register(k.api.createPodTask(ctx, pod)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also Register has this strange type with an error as second argument. Can we fix this as well here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 will do, indeed a strange pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in 6a16fe3
ping me when you need a review on this. |
doSchedule currently accepts err values from previous invocation delegating error handling in a location different from the caller which can be hard to debug and is not a good practice. We still maintain the same invariants after the refactoring. If an err happened in a previous invocation to Register, the returned task object was nil causing task.AcceptedOffer() to return false. By not invoking doSchedule in case of an error we can eliminate the first `err == nil` check in doScheduler.
The variable `offer` is shadowed in the if block causing the `then` branch to be non-effective.
c726d2e
to
6a16fe3
Compare
Labelling this PR as size/L |
lgtm |
GCE e2e test build/test passed for commit 6a16fe3. |
@jdef PTAL |
GCE e2e test build/test passed for commit 9b7e405. |
lgtm |
Continuous integration appears to have missed, closing and re-opening to trigger it |
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge] |
GCE e2e test build/test passed for commit 9b7e405. |
Automatic merge from submit-queue |
Auto commit by PR queue bot
…r-checking Rebased previously reverted merge, just before this scheduler refactoring. Auto commit by PR queue bot
…r-checking Rebased previously reverted merge, just before this scheduler refactoring. Auto commit by PR queue bot
xref mesosphere/kubernetes-mesos#475